Neural Networks for 2D to 3D Human Pose Estimation
Authors
Abstract
Neural networks have now been applied to almost every conceivable problem in computer vision. Human pose estimation is a long-standing problem that is being revisited with these techniques. This work investigates two questions that arise repeatedly when applying neural networks: how to obtain enough data to train them and, more fundamentally, how to exploit the structure of the problem in the design of the network. In the context of human pose estimation, we address the first question with a data augmentation method that adds noise to the joint angle values of the pose. We address the second by regressing joint angles and then applying a forward kinematics algorithm to obtain the 3D positions of the joints, as in [25]. Gradients can be back-propagated through the forward kinematics, so such a layer can be used in end-to-end training of the neural network. This approach enforces some of the structural constraints of the human body, rather than estimating multiple independent 3D positions. We found that data augmentation is highly beneficial to the performance of the system. We also found that the forward kinematics layer does not further improve performance, at least in our experimental setting.
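The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a planar (2D) kinematic tree instead of a full 3D skeleton, Gaussian noise for the augmentation, and hypothetical names (`forward_kinematics`, `augment_angles`, `parents`, `lengths`, `sigma`) chosen only for this example. Each joint position is a composition of sines and cosines of the angles, which is why gradients can flow through such a layer.

```python
import math
import random


def forward_kinematics(angles, lengths, parents):
    """Planar forward kinematics over a kinematic tree (illustrative only).

    angles[i]  : rotation of bone i relative to its parent bone, in radians
    lengths[i] : length of bone i
    parents[i] : index of the parent bone, or -1 if the bone starts at the root
    Parents must appear before their children. Returns the (x, y) end point
    of every bone, with the root fixed at the origin.
    """
    world_angle = [0.0] * len(angles)
    positions = [(0.0, 0.0)] * len(angles)
    for i, (theta, length, parent) in enumerate(zip(angles, lengths, parents)):
        if parent < 0:
            base_angle, (bx, by) = 0.0, (0.0, 0.0)
        else:
            base_angle, (bx, by) = world_angle[parent], positions[parent]
        # Accumulate rotations along the chain, then step along the bone.
        world_angle[i] = base_angle + theta
        positions[i] = (bx + length * math.cos(world_angle[i]),
                        by + length * math.sin(world_angle[i]))
    return positions


def augment_angles(angles, sigma, rng):
    """Return a new pose with Gaussian noise (assumed noise model) on each angle."""
    return [theta + rng.gauss(0.0, sigma) for theta in angles]


# A three-bone chain: bone 1 hangs off bone 0, bone 2 hangs off bone 1.
parents = [-1, 0, 1]
lengths = [1.0, 1.0, 0.5]
pose = [math.pi / 2, -math.pi / 2, 0.0]

joints = forward_kinematics(pose, lengths, parents)

# Augmentation perturbs the angles, not the positions, so every
# augmented pose still has the original bone lengths.
rng = random.Random(0)
noisy_pose = augment_angles(pose, sigma=0.1, rng=rng)
noisy_joints = forward_kinematics(noisy_pose, lengths, parents)
```

Because the noise is applied in angle space and positions are recomputed through the kinematic chain, the augmented poses keep fixed bone lengths. This is the kind of structural constraint that independent regression of 3D positions does not guarantee.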
Similar resources
3D Human Pose Estimation Using Convolutional Neural Networks with 2D Pose Information
While there has been success in 2D human pose estimation with convolutional neural networks (CNNs), 3D human pose estimation has not been thoroughly studied. In this paper, we tackle the 3D human pose estimation task with end-to-end learning using CNNs. Relative 3D positions between one joint and the other joints are learned via CNNs. The proposed method improves the performance of CNN with t...
Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study
Despite considerable advances in recognizing hand gestures from still images, there are still many challenges in the classification of hand gestures in videos. The latter comes with more challenges, including higher computational complexity and the arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...
A Dual-Source Approach for 3D Human Pose Estimation from a Single Image
In this work we address the challenging problem of 3D human pose estimation from single images. Recent approaches learn deep neural networks to regress 3D pose directly from images. One major challenge for such methods, however, is the collection of training data. Specifically, collecting large amounts of training data containing unconstrained images annotated with accurate 3D poses is infeasib...
V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map
Most of the existing deep learning-based methods for 3D hand and human pose estimation from a single depth map are based on a common framework that takes a 2D depth map and directly regresses the 3D coordinates of keypoints, such as hand or human body joints, via 2D convolutional neural networks (CNNs). The first weakness of this approach is the presence of perspective distortion in the 2D dept...
Unsupervised Adversarial Learning of 3D Human Pose from 2D Joint Locations
The task of three-dimensional (3D) human pose estimation from a single image can be divided into two parts: (1) Two-dimensional (2D) human joint detection from the image and (2) estimating a 3D pose from the 2D joints. Herein, we focus on the second part, i.e., a 3D pose estimation from 2D joint locations. The problem with existing methods is that they require either (1) a 3D pose dataset or (2...